By David Wood and Bryan Foertsch, Wood Associates
Rapid growth in Electronic Document Capture (EDC) technology has dramatically expanded the industry's capabilities.
The market is now product- and solution-rich, but information-poor. Users are often unaware of how existing technology and products can be applied to their data capture problems. This paper attempts to address this need by providing a convenient "encyclopedia" of relevant information that can be used as basic education or reference by information processing professionals throughout the business world.
Each company brings its expertise, market perspective and technological focus to a different EDC-related topic. We did not constrain their editorial freedom or subject, but did encourage everyone to talk about problems and solutions rather than products or features.
The result is a diverse set of articles on topics, including:
* how to cost-justify EDC
* selecting the best scanner
* maximizing the effectiveness of your installed scanners over time
* the importance of image enhancement to your system
* how to evaluate OCR accuracy
* how to avoid obsolescence.
These articles are relevant to every user who does forms data capture today and would like to reduce the cost of that portion of their business through automation. These issues are key to successful implementation of any EDC system, transcending vendor-specific issues such as feature sets, target markets and product pricing or support.
Market Size & Growth Rate
The Electronic Document Capture industry is growing squarely in the overlap of two huge industries: key data entry and electronic document management (EDM).
Rapid technological change has opened opportunities to automate large pieces of the $12 billion-per-year key data entry market through the use of image-based data entry systems. We estimate that by 2002, at least 20% of this $12 billion market will be converted to image-based data entry, using either automated data entry or manual key-from-image to capture data.
The second impacted industry is the rapidly growing EDM market, sized by Giga in 1995 at $3 billion per year with about 20% of that devoted to the EDC portion of the system.
We estimate that the EDC market today is about $600 million per year, with an annual growth rate of 50%-plus.
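The arithmetic behind these estimates can be checked in a few lines. The sketch below uses only the figures quoted above; the names are ours, and the multi-year projection simply compounds the lower bound of the "50%-plus" rate as an illustration, not as a forecast.

```python
# Figures quoted in the text, in millions of US dollars per year.
KEY_ENTRY_MARKET = 12_000     # key data entry market
EDM_MARKET_1995 = 3_000       # EDM market as sized by Giga in 1995
IMAGE_BASED_SHARE = 0.20      # share of key entry converting by 2002
EDC_SHARE_OF_EDM = 0.20       # portion of EDM devoted to EDC
GROWTH_RATE = 0.50            # "50%-plus" growth, taken at its lower bound


def image_based_key_entry() -> float:
    """Key data entry spending expected to move to image-based entry."""
    return KEY_ENTRY_MARKET * IMAGE_BASED_SHARE


def edc_market_today() -> float:
    """EDC market implied by the EDM figures."""
    return EDM_MARKET_1995 * EDC_SHARE_OF_EDM


def project(base: float, rate: float, years: int) -> float:
    """Compound a base value at a constant annual rate (illustrative only)."""
    return base * (1 + rate) ** years


if __name__ == "__main__":
    print(f"Image-based key entry by 2002: ${image_based_key_entry():,.0f} million")
    print(f"EDC market today: ${edc_market_today():,.0f} million")
    for year in range(1, 4):
        value = project(edc_market_today(), GROWTH_RATE, year)
        print(f"EDC market after {year} year(s) at 50%: ${value:,.0f} million")
```

Under these assumptions, 20% of the $12 billion key data entry market is $2.4 billion, and 20% of the $3 billion EDM market gives the $600 million EDC figure.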
The Basic Architecture
EDC systems all include the same basic operations:
* image acquisition
* image enhancement and QA
* data entry
* data quality assurance
* data and image export.
Image acquisition, for example, is defined differently depending upon the size and scope of the application. For a large insurance company, it may mean scanning 20,000 insurance claims per day in a single large room equipped with high-volume scanners. For a nationwide manufacturer's rep firm, it may mean a sales force faxing individual sales orders from laptop computers in the field, with no scanners involved.
Image enhancement and quality assurance will vary depending upon the use and intrinsic value of the image. Images being scanned for archival purposes tend to have lower image quality requirements than those being scanned for later automatic recognition, with the result that the image quality assurance process may be completely different.
Also, the appropriate image enhancement solution varies dramatically depending upon why it is needed. Possible uses include image cleanup for readability and performance, background removal for OCR, grayscale processing for photographs, or automatic separation of individual images scanned from microfilm.
Data entry may be accomplished by keying from a displayed image or automatically reading by OCR (machine print), ICR (handprint), OMR (mark reading), barcode recognition or date and time endorsing. Each user must define their requirements and match them against the capabilities of each technology to determine how their system should be architected.
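One way to picture this matching exercise is as a lookup from the kind of content in each form field to a capture method. The sketch below is a minimal illustration: the field kinds, the sample form and the fallback to keying from the image are our assumptions, not a description of any particular product.

```python
from enum import Enum


class FieldKind(Enum):
    """Kinds of content a form field may hold (hypothetical categories)."""
    MACHINE_PRINT = "machine print"
    HANDPRINT = "handprint"
    MARK = "check box or bubble"
    BARCODE = "barcode"
    OTHER = "anything else"


# Recognition technology named in the text for each kind of content.
CAPTURE_METHOD = {
    FieldKind.MACHINE_PRINT: "OCR",
    FieldKind.HANDPRINT: "ICR",
    FieldKind.MARK: "OMR",
    FieldKind.BARCODE: "barcode recognition",
}


def choose_capture_method(kind: FieldKind) -> str:
    """Return the automatic method for a field, or fall back to keying."""
    return CAPTURE_METHOD.get(kind, "key-from-image")


if __name__ == "__main__":
    # A made-up claim form, used only to show the lookup in action.
    claim_form = {
        "policy_number": FieldKind.BARCODE,
        "claimant_name": FieldKind.HANDPRINT,
        "preprinted_id": FieldKind.MACHINE_PRINT,
        "coverage_type": FieldKind.MARK,
        "signature": FieldKind.OTHER,
    }
    for name, kind in claim_form.items():
        print(f"{name}: {choose_capture_method(kind)}")
```

A real evaluation would also weigh accuracy, throughput and cost for each technology; the lookup only shows the first step of pairing each field with a candidate method.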
Data quality assurance requirements vary depending upon how expensive error repair is relative to the value of the data. Full-text indexes, for example, do not require much quality assurance because occasional errors do not materially impact the ability to retrieve the data. However, recognition errors on tax forms or insurance claims are unacceptable; all of them must be detected and repaired before the data can be used.
Data and image export consists of preparing the captured information in the target database format and then transporting it to the server running the DBMS. Possible issues include selecting products which support the required output format, and ensuring that the data can be cleanly transported, either over a network or via a compatible medium such as magnetic tape, optical disc or CD-ROM.
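The five operations described above can be pictured as stages that each document passes through in order. The following sketch is only a skeleton under our own assumptions: the Document record and the stage functions are hypothetical placeholders that record their names, standing in for whatever scanning, recognition and export products a given installation uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """A captured page plus the data extracted from it (hypothetical record)."""
    image: bytes = b""
    fields: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


Stage = Callable[[Document], Document]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(doc: Document) -> Document:
        doc.history.append(name)
        return doc
    return stage


# The basic operations, in the order listed in the text.
PIPELINE: List[Stage] = [
    make_stage("image acquisition"),
    make_stage("image enhancement and QA"),
    make_stage("data entry"),
    make_stage("data quality assurance"),
    make_stage("data and image export"),
]


def run(doc: Document, stages: List[Stage] = PIPELINE) -> Document:
    """Pass a document through each stage in sequence."""
    for stage in stages:
        doc = stage(doc)
    return doc


if __name__ == "__main__":
    result = run(Document())
    print(" -> ".join(result.history))
```

Keeping each operation behind the same simple interface reflects the point made above: the stages are common to all EDC systems, while what happens inside each one depends on the application.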
We trust that you will find this paper valuable today in better understanding the issues surrounding electronic document capture systems, and tomorrow in selecting a great solution. This paper was jointly produced by Bryan Foertsch and David Wood, partners in Wood Associates.
David Wood, founder of Wood Associates (Boulder Creek, CO), has been involved in the image and document capture business for 15 years. He was director of marketing at Calera Recognition Systems. He was also director of marketing at Cornerstone Imaging and vice president of new business at Law Cyprus Distributing Co.
Bryan Foertsch, a partner in Wood Associates, has 20 years of experience marketing computer applications systems. He was president and CEO of Sigma Designs Imaging Systems, and held positions at GE, Octel Communications, Ashton Tate and Centennial Computer.
IW Special Supplement, March 1996